Research Methods in Developmental Linguistics – Week 5

Dr Stefano Coretta

University of Edinburgh

Case study: Ota 2009

  • Data from Ota 2009.

  • L2 lexical representation of “near-homophones”: ROCK/LOCK for Japanese speakers.

Ota 2009: design

  • Semantic-relatedness task.

  • Pairs of written words presented visually (control and experimental).

    • Homophones: SIN-MOON, SON-MOON

    • Near-homophones: SOCK-KEY, ROCK-KEY

    • Minimal pairs: FEAR-PAW, PEAR-PAW

Research hypotheses

H1. The difference in RTs between unrelated and control is the same in homophones (H) and near-homophones (LR).

H2. There is no difference in RTs between unrelated and control in minimal pairs (PB).

Ota 2009: the data

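library(tidyverse)  # read_csv(), filter(), mutate()
library(brms)       # brm(), used for the models below
library(posterior)  # quantile2(), used for the posterior summaries below

ota2009 <- read_csv("data/ota2009/key-rock.csv") |>
  # Keep the TrialProc rows and drop the F contrast.
  filter(
    Procedure == "TrialProc", Contrast != "F"
  ) |>
  mutate(
    Subject = as.factor(Subject),
    RT_log = log(Words.RT),
    # Item identifier that is unique across versions and contrasts.
    Item_id = paste(Version, Contrast, Item, sep = "_")
  )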
Rows: 7538 Columns: 10
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr (6): Procedure, Version, Contrast, Condition, WordL, WordR
dbl (4): Subject, Item, Words.ACC, Words.RT

ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.

Ota 2009: the data

ota2009

Ota 2009: RTs

Figure 1: Logged RTs in three contrasts and two conditions.

Log-normal regression (RTs)

my_seed <- 9283

ota_bm_1 <- brm(
  # Equivalent to: Words.RT ~ Condition + Contrast + Condition:Contrast
  Words.RT ~ Condition * Contrast,
  family = lognormal,
  data = ota2009,
  seed = my_seed,
  cores = 4,
  file = "data/cache/ota_bm_1"
)

Log-normal regression: summary

summary(ota_bm_1, prob = 0.9)
 Family: lognormal 
  Links: mu = identity 
Formula: Words.RT ~ Condition * Contrast 
   Data: ota2009 (Number of observations: 2338) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
                              Estimate Est.Error l-90% CI u-90% CI Rhat
Intercept                         7.63      0.03     7.58     7.67 1.00
ConditionUnrelated                0.10      0.04     0.04     0.17 1.00
ContrastLR                        0.07      0.04     0.00     0.13 1.00
ContrastPB                       -0.00      0.04    -0.07     0.06 1.00
ConditionUnrelated:ContrastLR    -0.04      0.05    -0.13     0.05 1.00
ConditionUnrelated:ContrastPB    -0.06      0.05    -0.16     0.02 1.00
                              Bulk_ESS Tail_ESS
Intercept                         2361     2794
ConditionUnrelated                2072     2393
ContrastLR                        2399     2680
ContrastPB                        2343     2626
ConditionUnrelated:ContrastLR     2228     2617
ConditionUnrelated:ContrastPB     2176     2715

Further Distributional Parameters:
      Estimate Est.Error l-90% CI u-90% CI Rhat Bulk_ESS Tail_ESS
sigma     0.54      0.01     0.52     0.55 1.00     3793     2732

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
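
The coefficients above are on the log scale. A minimal sketch of how to read them in milliseconds, plugging in the posterior means from the summary (a rough approximation: a full treatment would transform every posterior draw instead). The variable names are illustrative, and the intercept is assumed to correspond to the control condition in the H contrast.

```r
# Posterior means copied from the summary of ota_bm_1 (log scale).
intercept   <- 7.63  # reference levels (assumed: control condition, H contrast)
b_unrelated <- 0.10  # ConditionUnrelated
sigma       <- 0.54

# In a log-normal model, exp(mu) is the median RT in ms.
median_control   <- exp(intercept)
median_unrelated <- exp(intercept + b_unrelated)

# The mean RT is larger than the median: exp(mu + sigma^2 / 2).
mean_control <- exp(intercept + sigma^2 / 2)

# A difference on the log scale is a ratio on the ms scale.
rt_ratio <- exp(b_unrelated)

round(
  c(
    median_control = median_control,
    median_unrelated = median_unrelated,
    mean_control = mean_control,
    rt_ratio = rt_ratio
  ),
  2
)
```

An estimate of 0.10 for ConditionUnrelated thus corresponds to RTs that are roughly 10% longer than in the reference condition.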

Log-normal regression: expected values

conditional_effects(ota_bm_1, effects = "Contrast:Condition")

BUT…

  • Each participant contributes multiple observations.

  • Each item list contributes multiple observations.

  • Observations from the same participant or list are not independent, so we need to include varying terms (also known as random or multilevel effects).

  • Regression models with varying terms are variously known as mixed-effects, multilevel, hierarchical, nested… They are all the same thing.

By-subject RTs

Figure 2: Log-RTs in the LR contrast by subject

By-list RTs

Figure 3: Log-RTs in the LR contrast by list

By-subject varying intercept

ota_bm_2 <- brm(
  Words.RT ~ 1 + (1 | Subject),
  family = lognormal,
  data = ota2009,
  seed = my_seed,
  cores = 4,
  file = "data/cache/ota_bm_2"
)

Model summary

summary(ota_bm_2, prob = 0.9)
 Family: lognormal 
  Links: mu = identity 
Formula: Words.RT ~ 1 + (1 | Subject) 
   Data: ota2009 (Number of observations: 2338) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~Subject (Number of levels: 20) 
              Estimate Est.Error l-90% CI u-90% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept)     0.35      0.06     0.26     0.47 1.02      362      782

Regression Coefficients:
          Estimate Est.Error l-90% CI u-90% CI Rhat Bulk_ESS Tail_ESS
Intercept     7.68      0.08     7.55     7.81 1.01      262      444

Further Distributional Parameters:
      Estimate Est.Error l-90% CI u-90% CI Rhat Bulk_ESS Tail_ESS
sigma     0.43      0.01     0.42     0.44 1.00     1307     1856

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
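
To get a feel for what sd(Intercept) = 0.35 means, the sketch below computes the range within which about 90% of subjects' median RTs would be expected to fall. It treats the posterior means as if they were known values, so it is only a rough illustration; the variable names are illustrative.

```r
# Posterior means copied from the summary of ota_bm_2 (log scale).
intercept    <- 7.68  # overall intercept
sd_intercept <- 0.35  # by-subject standard deviation of the intercept

# Subject intercepts are modelled as Normal(intercept, sd_intercept),
# so the central 90% lies within +/- qnorm(0.95) standard deviations.
z <- qnorm(0.95)
subject_range_log <- intercept + c(-1, 1) * z * sd_intercept

# Back-transform to ms: the range of subject-specific median RTs.
subject_range_ms <- exp(subject_range_log)
round(subject_range_ms)
```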

Model draws

ota_bm_2_draws <- as_draws_df(ota_bm_2)
ota_bm_2_draws

Model plot

plot(ota_bm_2)

By-subject varying intercept and slope

ota_bm_3 <- brm(
  Words.RT ~ Condition +
    (Condition | Subject),
  family = lognormal,
  data = ota2009,
  seed = my_seed,
  cores = 4,
  file = "data/cache/ota_bm_3"
)
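
As a sketch, the model specified above can be written out as follows. The symbols are notation introduced here for illustration, not taken from the slides: s[i] is the subject of observation i, and Unrelated_i is 1 for the unrelated condition and 0 otherwise. Priors are left out.

```latex
\begin{aligned}
RT_i &\sim \mathrm{LogNormal}(\mu_i, \sigma) \\
\mu_i &= (\beta_0 + u_{0,s[i]}) + (\beta_1 + u_{1,s[i]}) \cdot \mathrm{Unrelated}_i \\
\begin{pmatrix} u_{0,s} \\ u_{1,s} \end{pmatrix}
  &\sim \mathrm{MVNormal}\left(
    \begin{pmatrix} 0 \\ 0 \end{pmatrix},
    \begin{pmatrix}
      \sigma_0^2 & \rho \sigma_0 \sigma_1 \\
      \rho \sigma_0 \sigma_1 & \sigma_1^2
    \end{pmatrix}
  \right)
\end{aligned}
```

In the model summary, sd(Intercept) corresponds to \sigma_0, sd(ConditionUnrelated) to \sigma_1, and cor(Intercept,ConditionUnrelated) to \rho.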

Model summary

summary(ota_bm_3, prob = 0.9)
 Family: lognormal 
  Links: mu = identity 
Formula: Words.RT ~ Condition + (Condition | Subject) 
   Data: ota2009 (Number of observations: 2338) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~Subject (Number of levels: 20) 
                                  Estimate Est.Error l-90% CI u-90% CI Rhat
sd(Intercept)                         0.35      0.07     0.26     0.48 1.01
sd(ConditionUnrelated)                0.04      0.03     0.00     0.09 1.00
cor(Intercept,ConditionUnrelated)     0.06      0.48    -0.75     0.85 1.00
                                  Bulk_ESS Tail_ESS
sd(Intercept)                          790     1558
sd(ConditionUnrelated)                1261     2038
cor(Intercept,ConditionUnrelated)     3515     1858

Regression Coefficients:
                   Estimate Est.Error l-90% CI u-90% CI Rhat Bulk_ESS Tail_ESS
Intercept              7.64      0.08     7.51     7.77 1.00      674      904
ConditionUnrelated     0.07      0.02     0.03     0.10 1.00     4384     2558

Further Distributional Parameters:
      Estimate Est.Error l-90% CI u-90% CI Rhat Bulk_ESS Tail_ESS
sigma     0.43      0.01     0.42     0.44 1.00     5986     2868

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Model draws

ota_bm_3_draws <- as_draws_df(ota_bm_3)
ota_bm_3_draws

Expected values

conditional_effects(ota_bm_3)

By-subject varying intercept and slopes

ota_bm_4 <- brm(
  Words.RT ~ Condition * Contrast +
    (Condition * Contrast | Subject),
  family = lognormal,
  data = ota2009,
  seed = my_seed,
  cores = 4,
  file = "data/cache/ota_bm_4"
)

Model summary

summary(ota_bm_4, prob = 0.9)
 Family: lognormal 
  Links: mu = identity 
Formula: Words.RT ~ Condition * Contrast + (Condition * Contrast | Subject) 
   Data: ota2009 (Number of observations: 2338) 
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Multilevel Hyperparameters:
~Subject (Number of levels: 20) 
                                                                 Estimate
sd(Intercept)                                                        0.37
sd(ConditionUnrelated)                                               0.04
sd(ContrastLR)                                                       0.04
sd(ContrastPB)                                                       0.05
sd(ConditionUnrelated:ContrastLR)                                    0.07
sd(ConditionUnrelated:ContrastPB)                                    0.05
cor(Intercept,ConditionUnrelated)                                    0.04
cor(Intercept,ContrastLR)                                           -0.14
cor(ConditionUnrelated,ContrastLR)                                   0.07
cor(Intercept,ContrastPB)                                           -0.03
cor(ConditionUnrelated,ContrastPB)                                  -0.09
cor(ContrastLR,ContrastPB)                                           0.08
cor(Intercept,ConditionUnrelated:ContrastLR)                        -0.16
cor(ConditionUnrelated,ConditionUnrelated:ContrastLR)               -0.03
cor(ContrastLR,ConditionUnrelated:ContrastLR)                        0.03
cor(ContrastPB,ConditionUnrelated:ContrastLR)                        0.05
cor(Intercept,ConditionUnrelated:ContrastPB)                         0.13
cor(ConditionUnrelated,ConditionUnrelated:ContrastPB)               -0.10
cor(ContrastLR,ConditionUnrelated:ContrastPB)                       -0.01
cor(ContrastPB,ConditionUnrelated:ContrastPB)                       -0.07
cor(ConditionUnrelated:ContrastLR,ConditionUnrelated:ContrastPB)     0.01
                                                                 Est.Error
sd(Intercept)                                                         0.07
sd(ConditionUnrelated)                                                0.03
sd(ContrastLR)                                                        0.03
sd(ContrastPB)                                                        0.03
sd(ConditionUnrelated:ContrastLR)                                     0.04
sd(ConditionUnrelated:ContrastPB)                                     0.03
cor(Intercept,ConditionUnrelated)                                     0.33
cor(Intercept,ContrastLR)                                             0.36
cor(ConditionUnrelated,ContrastLR)                                    0.37
cor(Intercept,ContrastPB)                                             0.34
cor(ConditionUnrelated,ContrastPB)                                    0.37
cor(ContrastLR,ContrastPB)                                            0.39
cor(Intercept,ConditionUnrelated:ContrastLR)                          0.33
cor(ConditionUnrelated,ConditionUnrelated:ContrastLR)                 0.37
cor(ContrastLR,ConditionUnrelated:ContrastLR)                         0.37
cor(ContrastPB,ConditionUnrelated:ContrastLR)                         0.37
cor(Intercept,ConditionUnrelated:ContrastPB)                          0.37
cor(ConditionUnrelated,ConditionUnrelated:ContrastPB)                 0.37
cor(ContrastLR,ConditionUnrelated:ContrastPB)                         0.38
cor(ContrastPB,ConditionUnrelated:ContrastPB)                         0.38
cor(ConditionUnrelated:ContrastLR,ConditionUnrelated:ContrastPB)      0.37
                                                                 l-90% CI
sd(Intercept)                                                        0.27
sd(ConditionUnrelated)                                               0.00
sd(ContrastLR)                                                       0.00
sd(ContrastPB)                                                       0.00
sd(ConditionUnrelated:ContrastLR)                                    0.01
sd(ConditionUnrelated:ContrastPB)                                    0.00
cor(Intercept,ConditionUnrelated)                                   -0.50
cor(Intercept,ContrastLR)                                           -0.68
cor(ConditionUnrelated,ContrastLR)                                  -0.56
cor(Intercept,ContrastPB)                                           -0.59
cor(ConditionUnrelated,ContrastPB)                                  -0.68
cor(ContrastLR,ContrastPB)                                          -0.57
cor(Intercept,ConditionUnrelated:ContrastLR)                        -0.66
cor(ConditionUnrelated,ConditionUnrelated:ContrastLR)               -0.63
cor(ContrastLR,ConditionUnrelated:ContrastLR)                       -0.59
cor(ContrastPB,ConditionUnrelated:ContrastLR)                       -0.57
cor(Intercept,ConditionUnrelated:ContrastPB)                        -0.50
cor(ConditionUnrelated,ConditionUnrelated:ContrastPB)               -0.68
cor(ContrastLR,ConditionUnrelated:ContrastPB)                       -0.63
cor(ContrastPB,ConditionUnrelated:ContrastPB)                       -0.66
cor(ConditionUnrelated:ContrastLR,ConditionUnrelated:ContrastPB)    -0.61
                                                                 u-90% CI Rhat
sd(Intercept)                                                        0.49 1.00
sd(ConditionUnrelated)                                               0.10 1.00
sd(ContrastLR)                                                       0.09 1.00
sd(ContrastPB)                                                       0.11 1.00
sd(ConditionUnrelated:ContrastLR)                                    0.14 1.00
sd(ConditionUnrelated:ContrastPB)                                    0.11 1.00
cor(Intercept,ConditionUnrelated)                                    0.57 1.00
cor(Intercept,ContrastLR)                                            0.48 1.00
cor(ConditionUnrelated,ContrastLR)                                   0.67 1.00
cor(Intercept,ContrastPB)                                            0.55 1.00
cor(ConditionUnrelated,ContrastPB)                                   0.55 1.00
cor(ContrastLR,ContrastPB)                                           0.68 1.00
cor(Intercept,ConditionUnrelated:ContrastLR)                         0.40 1.00
cor(ConditionUnrelated,ConditionUnrelated:ContrastLR)                0.61 1.00
cor(ContrastLR,ConditionUnrelated:ContrastLR)                        0.63 1.00
cor(ContrastPB,ConditionUnrelated:ContrastLR)                        0.63 1.00
cor(Intercept,ConditionUnrelated:ContrastPB)                         0.71 1.00
cor(ConditionUnrelated,ConditionUnrelated:ContrastPB)                0.53 1.00
cor(ContrastLR,ConditionUnrelated:ContrastPB)                        0.61 1.00
cor(ContrastPB,ConditionUnrelated:ContrastPB)                        0.58 1.00
cor(ConditionUnrelated:ContrastLR,ConditionUnrelated:ContrastPB)     0.62 1.00
                                                                 Bulk_ESS
sd(Intercept)                                                        1355
sd(ConditionUnrelated)                                               1820
sd(ContrastLR)                                                       2509
sd(ContrastPB)                                                       1763
sd(ConditionUnrelated:ContrastLR)                                    1487
sd(ConditionUnrelated:ContrastPB)                                    2509
cor(Intercept,ConditionUnrelated)                                    6611
cor(Intercept,ContrastLR)                                            6265
cor(ConditionUnrelated,ContrastLR)                                   4357
cor(Intercept,ContrastPB)                                            7466
cor(ConditionUnrelated,ContrastPB)                                   3606
cor(ContrastLR,ContrastPB)                                           3538
cor(Intercept,ConditionUnrelated:ContrastLR)                         6014
cor(ConditionUnrelated,ConditionUnrelated:ContrastLR)                4019
cor(ContrastLR,ConditionUnrelated:ContrastLR)                        3376
cor(ContrastPB,ConditionUnrelated:ContrastLR)                        3374
cor(Intercept,ConditionUnrelated:ContrastPB)                         8227
cor(ConditionUnrelated,ConditionUnrelated:ContrastPB)                4609
cor(ContrastLR,ConditionUnrelated:ContrastPB)                        4686
cor(ContrastPB,ConditionUnrelated:ContrastPB)                        3919
cor(ConditionUnrelated:ContrastLR,ConditionUnrelated:ContrastPB)     3631
                                                                 Tail_ESS
sd(Intercept)                                                        1820
sd(ConditionUnrelated)                                               2440
sd(ContrastLR)                                                       2614
sd(ContrastPB)                                                       2249
sd(ConditionUnrelated:ContrastLR)                                    2033
sd(ConditionUnrelated:ContrastPB)                                    2661
cor(Intercept,ConditionUnrelated)                                    3172
cor(Intercept,ContrastLR)                                            2995
cor(ConditionUnrelated,ContrastLR)                                   3364
cor(Intercept,ContrastPB)                                            3082
cor(ConditionUnrelated,ContrastPB)                                   3187
cor(ContrastLR,ContrastPB)                                           3192
cor(Intercept,ConditionUnrelated:ContrastLR)                         3094
cor(ConditionUnrelated,ConditionUnrelated:ContrastLR)                3312
cor(ContrastLR,ConditionUnrelated:ContrastLR)                        3024
cor(ContrastPB,ConditionUnrelated:ContrastLR)                        3652
cor(Intercept,ConditionUnrelated:ContrastPB)                         2671
cor(ConditionUnrelated,ConditionUnrelated:ContrastPB)                3302
cor(ContrastLR,ConditionUnrelated:ContrastPB)                        3492
cor(ContrastPB,ConditionUnrelated:ContrastPB)                        3482
cor(ConditionUnrelated:ContrastLR,ConditionUnrelated:ContrastPB)     3511

Regression Coefficients:
                              Estimate Est.Error l-90% CI u-90% CI Rhat
Intercept                         7.63      0.09     7.49     7.78 1.00
ConditionUnrelated                0.10      0.03     0.05     0.15 1.00
ContrastLR                        0.07      0.03     0.01     0.12 1.00
ContrastPB                       -0.01      0.03    -0.06     0.05 1.00
ConditionUnrelated:ContrastLR    -0.04      0.05    -0.12     0.04 1.00
ConditionUnrelated:ContrastPB    -0.06      0.04    -0.13     0.01 1.00
                              Bulk_ESS Tail_ESS
Intercept                          820     1514
ConditionUnrelated                4094     3360
ContrastLR                        4295     2723
ContrastPB                        4510     3191
ConditionUnrelated:ContrastLR     4317     3249
ConditionUnrelated:ContrastPB     4524     3497

Further Distributional Parameters:
      Estimate Est.Error l-90% CI u-90% CI Rhat Bulk_ESS Tail_ESS
sigma     0.43      0.01     0.42     0.44 1.00     7787     2574

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

Expected values

conditional_effects(ota_bm_4, "Contrast:Condition")

Expected values (without varying terms)

conditional_effects(ota_bm_1, effects = "Contrast:Condition")

Constant (fixed) effects

fixef(ota_bm_4, probs = c(0.05, 0.95)) |> round(2)
                              Estimate Est.Error    Q5  Q95
Intercept                         7.63      0.09  7.49 7.78
ConditionUnrelated                0.10      0.03  0.05 0.15
ContrastLR                        0.07      0.03  0.01 0.12
ContrastPB                       -0.01      0.03 -0.06 0.05
ConditionUnrelated:ContrastLR    -0.04      0.05 -0.12 0.04
ConditionUnrelated:ContrastPB    -0.06      0.04 -0.13 0.01
fixef(ota_bm_1, probs = c(0.05, 0.95)) |> round(2)
                              Estimate Est.Error    Q5  Q95
Intercept                         7.63      0.03  7.58 7.67
ConditionUnrelated                0.10      0.04  0.04 0.17
ContrastLR                        0.07      0.04  0.00 0.13
ContrastPB                        0.00      0.04 -0.07 0.06
ConditionUnrelated:ContrastLR    -0.04      0.05 -0.13 0.05
ConditionUnrelated:ContrastPB    -0.06      0.05 -0.16 0.02

Effect of condition in PB: calculate

ota_bm_4_draws <- as_draws_df(ota_bm_4) |> 
  mutate(
    unrel_h = b_ConditionUnrelated,
    unrel_lr = b_ConditionUnrelated + `b_ConditionUnrelated:ContrastLR`,
    unrel_pb = b_ConditionUnrelated + `b_ConditionUnrelated:ContrastPB`
  )

quantile2(ota_bm_4_draws$unrel_pb) |> round(2)
   q5   q95 
-0.02  0.10 
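
The key step above is that the coefficients are summed draw by draw before summarising, rather than adding up their summaries. A self-contained sketch of that logic with simulated stand-in draws (NOT the real posterior of ota_bm_4: the draws are generated independently from normal distributions loosely based on the summary table, and the column names are hypothetical):

```r
set.seed(9283)
n_draws <- 4000

# Stand-in draws (assumption: independent normals, for illustration only).
draws <- data.frame(
  b_unrelated    = rnorm(n_draws, mean = 0.10, sd = 0.03),
  b_unrelated_pb = rnorm(n_draws, mean = -0.06, sd = 0.04)
)

# Effect of 'unrelated' in PB: sum the two coefficients within each draw.
draws$unrel_pb <- draws$b_unrelated + draws$b_unrelated_pb

# 90% credible interval of the derived quantity.
cri_90 <- quantile(draws$unrel_pb, probs = c(0.05, 0.95))
round(cri_90, 2)
```

In a real posterior the coefficients are usually correlated, which is why the sum has to be computed on the draws themselves.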

Effect of condition in PB: plot

Figure 4: Effect of ‘unrelated’ in three contrasts.

Difference of effect of condition in H and LR

ota_bm_4_draws <- ota_bm_4_draws |> 
  mutate(
    unrel_h_lr = unrel_h - unrel_lr
  )

quantile2(ota_bm_4_draws$unrel_h_lr) |> round(2)
   q5   q95 
-0.04  0.12 
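
Both intervals are on the log scale. Since exp() is monotonic, the interval bounds can be exponentiated directly to express them as RT ratios. A minimal sketch using the two intervals reported above (object names are illustrative):

```r
# 90% CrIs copied from the quantile2() output above (log scale).
cri_pb   <- c(q5 = -0.02, q95 = 0.10)  # effect of 'unrelated' in PB
cri_h_lr <- c(q5 = -0.04, q95 = 0.12)  # H effect minus LR effect

# Ratio of unrelated to control RTs in PB.
ratio_pb <- exp(cri_pb)

# Ratio of the H ratio to the LR ratio (a ratio of ratios).
ratio_h_lr <- exp(cri_h_lr)

round(rbind(PB = ratio_pb, H_vs_LR = ratio_h_lr), 2)
```

Both intervals include 1 on the ratio scale, which corresponds to 0 on the log scale.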

Results overview

H1. The difference in RTs between unrelated and control is the same in homophones (H) and near-homophones (LR).

  • Not enough evidence to assess (90% CrI [-0.04, 0.12]).

H2. There is no difference in RTs between unrelated and control in minimal pairs (PB).

  • Not enough evidence to assess (90% CrI [-0.02, 0.10]).

Summary

  • Include varying terms when the data are “hierarchical”, for example when subjects or items contribute repeated measures.

  • Not including varying terms (wrongly) inflates posterior certainty.

  • Frequentist regression models fitted with lme4 often fail to converge, which leads researchers to simplify the hierarchical structure of the model.
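
The point about inflated certainty can be illustrated with a small simulation. This is a sketch with simulated data, not the Ota 2009 data, and it uses standard errors as a simple stand-in for posterior uncertainty; the values 0.35 and 0.43 are loosely based on the summaries above and all object names are illustrative.

```r
set.seed(9283)
n_subjects <- 20
n_trials   <- 100

# Simulated log-RTs: each subject has their own intercept.
subject <- rep(seq_len(n_subjects), each = n_trials)
subject_intercept <- rnorm(n_subjects, mean = 0, sd = 0.35)
log_rt <- 7.6 + subject_intercept[subject] +
  rnorm(n_subjects * n_trials, mean = 0, sd = 0.43)

# Naive standard error: treats all 2000 trials as independent.
naive_se <- sd(log_rt) / sqrt(length(log_rt))

# Standard error based on subject means: respects the grouping.
subject_means <- tapply(log_rt, subject, mean)
subject_se <- sd(subject_means) / sqrt(n_subjects)

round(c(naive_se = naive_se, subject_se = subject_se), 3)
```

Ignoring the grouping gives a much smaller standard error for the overall mean. This parallels the intercept's Est.Error, which is 0.03 in ota_bm_1 and 0.09 in ota_bm_4.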